Edge dependent pathway scoring for calculating semantic similarity in ConceptNet

Authors

  • Steve Spagnola
  • Carl Lagoze
Abstract

Most techniques that calculate the relatedness between two concepts use a semantic network, such as Wikipedia, WordNet, or ConceptNet, to find the shortest intermediate pathway between two nodes. These techniques assume that a low number of edges on the shortest pathway indicates conceptual similarity. Although this approach has been shown to agree with psychological data, we test the usefulness of additional pathway variables in ConceptNet, such as edge type and user-rated score. Our results show strong evidence for the application of additional pathway variables in calculating semantic similarity.

1 ConceptNet Pathways

ConceptNet 3 is one of the largest commonsense semantic networks in existence, relying on its users to make conceptual assertions and collectively vote on the legitimacy of other users' assertions. ConceptNet is valuable as a semantic resource because it suggests transitive inference between ideas, enabling dissimilar concepts to share an indirect semantic relationship. Unlike Wikipedia and WordNet, the edges in ConceptNet contain additional semantic information between two concepts. Each edge is assigned a relation type (such as "Is A" or "Located At") and a score that reflects how strongly ConceptNet users believe in the validity of the relation [Havasi and Alonso (2007)].

Previous work on calculating semantic relatedness between two concepts ignores these extra edge features, using only the inverse of the number of edges on a short path [Rada and Blettner (1989); Wubben and A. (2009)]. Instead of looking only for the shortest pathway from one concept to another, we assess all pathways in measuring the semantic similarity of an association. A simple example is shown in Figure 1: two nodes (cat and dog) with two pathways containing varying intermediary nodes and edge types.

Figure 1: Transitive links between nodes in ConceptNet may occur through a variety of pathways, some being more appropriate than others.
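The contrast between the shortest-path baseline and a score that uses edge type and user rating can be illustrated with a small sketch. This is not the authors' scoring function, which the abstract does not specify. The toy assertions, the relation-type weights in TYPE_WEIGHT, the MAX_SCORE normalization, and the rule of multiplying edge weights along a path and summing over paths are all assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy assertions: (concept, relation type, concept, user score).
# The intermediary nodes and scores are invented, not taken from Figure 1.
ASSERTIONS = [
    ("cat", "IsA", "pet", 4.0),
    ("dog", "IsA", "pet", 4.0),
    ("cat", "LocatedAt", "house", 1.0),
    ("dog", "LocatedAt", "house", 2.0),
]

# Assumed weights: how much each relation type contributes to similarity.
TYPE_WEIGHT = {"IsA": 1.0, "LocatedAt": 0.5}
MAX_SCORE = 4.0  # assumed cap used to normalize user scores into [0, 1]

Adjacency = Dict[str, List[Tuple[str, str, float]]]
Path = List[Tuple[str, float]]  # sequence of (relation type, user score)


def build_adjacency(assertions) -> Adjacency:
    """Treat every assertion as an undirected edge."""
    adj: Adjacency = {}
    for left, relation, right, score in assertions:
        adj.setdefault(left, []).append((right, relation, score))
        adj.setdefault(right, []).append((left, relation, score))
    return adj


def simple_paths(adj: Adjacency, start: str, goal: str, max_edges: int) -> List[Path]:
    """Enumerate all cycle-free paths from start to goal with at most max_edges edges."""
    found: List[Path] = []

    def walk(node: str, visited: set, edges: Path) -> None:
        if node == goal:
            found.append(list(edges))
            return
        if len(edges) == max_edges:
            return
        for neighbor, relation, score in adj.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                edges.append((relation, score))
                walk(neighbor, visited, edges)
                edges.pop()
                visited.remove(neighbor)

    walk(start, {start}, [])
    return found


def shortest_path_similarity(paths: List[Path]) -> float:
    """Baseline: inverse of the edge count on the shortest path (0 if unreachable)."""
    return 1.0 / min(len(p) for p in paths) if paths else 0.0


def edge_dependent_similarity(paths: List[Path]) -> float:
    """Assumed rule: multiply per-edge weights along a path, then sum over all paths."""
    total = 0.0
    for path in paths:
        strength = 1.0
        for relation, score in path:
            strength *= TYPE_WEIGHT.get(relation, 0.0) * min(score / MAX_SCORE, 1.0)
        total += strength
    return total


if __name__ == "__main__":
    paths = simple_paths(build_adjacency(ASSERTIONS), "cat", "dog", max_edges=3)
    print("baseline:", shortest_path_similarity(paths))
    print("edge-dependent:", edge_dependent_similarity(paths))
```

In this toy graph both pathways have two edges, so the baseline cannot tell them apart, whereas the edge-dependent score lets the highly rated "IsA" pathway dominate the weakly rated "LocatedAt" pathway.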

Similar resources

A semantic relatedness metric based on free link structure

While shortest paths in WordNet are known to correlate well with semantic similarity, an is-a hierarchy is less suited for estimating semantic relatedness. We demonstrate this by comparing two free scale networks (ConceptNet and Wikipedia) to WordNet. Using the Finkelstein353 dataset we show that a shortest path metric run on Wikipedia attains a better correlation than WordNet-based metrics. C...

A semantic relatedness metric based on free link structure (short paper)

While shortest paths in WordNet are known to correlate well with semantic similarity, an is-a hierarchy is less suited for estimating semantic relatedness. We demonstrate this by comparing two free scale networks (ConceptNet and Wikipedia) to WordNet. Using the Finkelstein353 dataset we show that a shortest path metric run on Wikipedia attains a better correlation than WordNet-based metrics. C...

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing natural-language answers using computational methods and machine learning algorithms. The development of large-scale smart education systems on the one hand, and the importance of assessment as a key factor in the learning process together with the challenges it faces on the other, have significantly increased the need for ...

A Novel Information Theoretic Framework for Finding Semantic Similarity in WordNet

Information content (IC) based measures for finding semantic similarity are gaining preference day by day. The semantics of concepts can be well characterized by information theory. The conventional way of calculating IC is based on the probability of appearance of concepts in corpora. Due to the data sparseness and corpus dependency issues of those conventional approaches, a new corpora independen...


Publication date: 2011